Measuring Audio and Visual Speech Synchrony: Methods and Applications
نویسندگان
چکیده
Speech is a means of communication that is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech and more specifically on techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, transformations performed on audio, visual or joint audiovisual feature spaces and the actual measure of correspondence between audio and visual speech. Finally, applications of this specific task are described.
منابع مشابه
Measuring Audio and Visual Speech Synchrony: Methods
Speech is a means of communication that is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech and more specifically on techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, tran...
متن کاملRobust audio-visual speech synchrony detection by generalized bimodal linear prediction
We study the problem of detecting audio-visual synchrony in video segments containing a speaker in frontal head pose. The problem holds a number of important applications, for example speech source localization, speech activity detection, speaker diarization, speech source separation, and biometric spoofing detection. In particular, we build on earlier work, extending our previously proposed ti...
متن کاملAn audio-visual approach to measuring discourse synchrony in multimodal conversation data
This paper describes recent work on the automatic extraction of visual and audio parameters relating to the detection of synchrony in discourse, and to the modelling of active listening for advanced speech technology. It reports findings based on image processing that reliably identify the strong entrainment between members of a group conversation, and describes techniques for the extraction an...
متن کاملAn Alternative to Low-Level-Synchrony-Based Methods for Speech Detection
Determining whether someone is talking has applications in many areas such as speech recognition, speaker diarization, social robotics, facial expression recognition, and human computer interaction. One popular approach to this problem is audio-visual synchrony detection [10, 21, 12]. A candidate speaker is deemed to be talking if the visual signal around that speaker correlates with the audito...
متن کاملModeling the Synchrony between Audio and Visual Modalities for Speaker Identification
This work aims to understand and model the inter-modal temporal relations between the audio and visual modalities of speech and validate whether the captured relations can improve the performance of audio-visual bimodal modeling for such applications as audio-visual speaker identification. We propose to extend our audio-visual correlative model (AVCM) with explicit durational modeling of the pa...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006